OcrV1, Main, Exploration, bibRecord, 000A93

An image-based automatic Arabic translation system

Internal identifier: 000A93 (Main/Exploration); previous: 000A92; next: 000A94

Authors: YI CHANG [United States]; DATONG CHEN [United States]; YING ZHANG [United States]; JIE YANG [United States]

Source:

Pattern recognition [0031-3203]; 2009.

RBID: Pascal:09-0262446

French descriptors

Pascal (Inist)
- Système automatique, Traduction automatique, Arabe, Anglais, Reconnaissance image, Reconnaissance caractère, Algorithme apprentissage, Machine vecteur support, Localisation, Reconnaissance optique caractère, Correction erreur, Segmentation, Structure donnée, Appareil portatif, Reconnaissance parole, Précision, Classification image, Traitement langage, Reconnaissance forme, Classification signal, Classification automatique, Traitement parole, Traitement image.
Wicri:
- topic: Traduction automatique.

English descriptors

KwdEn:
- Accuracy, Arabic, Automatic classification, Automatic system, Automatic translation, Character recognition, Data structure, English, Error correction, Image classification, Image processing, Image recognition, Language processing, Learning algorithm, Localization, Optical character recognition, Pattern recognition, Portable equipment, Segmentation, Signal classification, Speech processing, Speech recognition, Support vector machine.

Abstract

In this paper, we present a system that automatically translates Arabic text embedded in images into English. The system consists of three components: text detection from images, character recognition, and machine translation. We formulate text detection as a binary classification problem and apply gradient boosting trees (GBT), a support vector machine (SVM), and location-based prior knowledge to improve the F1 score of text detection from 78.95% to 87.05%. The detected text images are processed by off-the-shelf optical character recognition (OCR) software. We employ an error correction model to post-process the noisy OCR output, and apply a bigram language model to reduce word segmentation errors. The translation module is tailored with a compact data structure for hand-held devices. The experimental results show substantial improvements in both word recognition accuracy and translation quality. For instance, in the experiment on the Arabic transparent font, the BLEU score increases from 18.70 to 33.47 with the use of the error correction module.
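
As a rough illustration of the bigram step mentioned in the abstract, the following Python sketch picks the most probable split of an unspaced string under a bigram model. The vocabulary, the counts, the add-one smoothing, and the function names `bigram_logprob` and `segment` are all assumptions made for this example. This record does not describe the authors' actual model, data, or implementation, and the toy data is English rather than Arabic.

```python
import math
from functools import lru_cache

# Toy bigram counts: purely illustrative, not taken from the paper.
BIGRAM_COUNTS = {
    ("<s>", "now"): 3,
    ("<s>", "no"): 1,
    ("now", "here"): 4,
    ("no", "where"): 1,
}
VOCAB = {w for pair in BIGRAM_COUNTS for w in pair} - {"<s>"}
MAX_WORD_LEN = max(len(w) for w in VOCAB)


def bigram_logprob(prev, word):
    """Add-one smoothed log P(word | prev) over the toy vocabulary."""
    pair_count = BIGRAM_COUNTS.get((prev, word), 0)
    prev_total = sum(c for (p, _), c in BIGRAM_COUNTS.items() if p == prev)
    return math.log((pair_count + 1) / (prev_total + len(VOCAB)))


def segment(text):
    """Return the most probable split of `text` into vocabulary words, or None."""

    @lru_cache(maxsize=None)
    def best(start, prev):
        # Best (score, words) for text[start:], given the previous word.
        if start == len(text):
            return 0.0, ()
        candidates = []
        last_end = min(len(text), start + MAX_WORD_LEN)
        for end in range(start + 1, last_end + 1):
            word = text[start:end]
            if word not in VOCAB:
                continue
            tail_score, tail_words = best(end, word)
            if tail_words is None:
                continue  # the remainder cannot be segmented
            score = bigram_logprob(prev, word) + tail_score
            candidates.append((score, (word,) + tail_words))
        return max(candidates) if candidates else (None, None)

    _, words = best(0, "<s>")
    return list(words) if words is not None else None


print(segment("nowhere"))
```

With these toy counts, `segment("nowhere")` prefers `now here` over `no where`, because the first split has the higher smoothed bigram probability (0.3125 versus 0.1).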

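Affiliations: School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States (all four authors)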

Links to previous steps (curation, corpus...)

to stream PascalFrancis, to step Corpus: 000226
to stream PascalFrancis, to step Curation: 000554
to stream PascalFrancis, to step Checkpoint: 000210
to stream Main, to step Merge: 000B04
to stream Main, to step Curation: 000A93

The document in XML format

<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">An image-based automatic Arabic translation system</title>
<author><name sortKey="Yi Chang" sort="Yi Chang" uniqKey="Yi Chang" last="Yi Chang">YI CHANG</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue</s1>
<s2>Pittsburgh, PA 15213</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Pennsylvanie</region>
<settlement type="city">Pittsburgh</settlement>
</placeName>
<orgName type="university">Université Carnegie-Mellon</orgName>
</affiliation>
</author>
<author><name sortKey="Datong Chen" sort="Datong Chen" uniqKey="Datong Chen" last="Datong Chen">DATONG CHEN</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue</s1>
<s2>Pittsburgh, PA 15213</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Pennsylvanie</region>
<settlement type="city">Pittsburgh</settlement>
</placeName>
<orgName type="university">Université Carnegie-Mellon</orgName>
</affiliation>
</author>
<author><name sortKey="Ying Zhang" sort="Ying Zhang" uniqKey="Ying Zhang" last="Ying Zhang">YING ZHANG</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue</s1>
<s2>Pittsburgh, PA 15213</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Pennsylvanie</region>
<settlement type="city">Pittsburgh</settlement>
</placeName>
<orgName type="university">Université Carnegie-Mellon</orgName>
</affiliation>
</author>
<author><name sortKey="Jie Yang" sort="Jie Yang" uniqKey="Jie Yang" last="Jie Yang">JIE YANG</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue</s1>
<s2>Pittsburgh, PA 15213</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Pennsylvanie</region>
<settlement type="city">Pittsburgh</settlement>
</placeName>
<orgName type="university">Université Carnegie-Mellon</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">09-0262446</idno>
<date when="2009">2009</date>
<idno type="stanalyst">PASCAL 09-0262446 INIST</idno>
<idno type="RBID">Pascal:09-0262446</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000226</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000554</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000210</idno>
<idno type="wicri:doubleKey">0031-3203:2009:Yi Chang:an:image:based</idno>
<idno type="wicri:Area/Main/Merge">000B04</idno>
<idno type="wicri:Area/Main/Curation">000A93</idno>
<idno type="wicri:Area/Main/Exploration">000A93</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">An image-based automatic Arabic translation system</title>
<author><name sortKey="Yi Chang" sort="Yi Chang" uniqKey="Yi Chang" last="Yi Chang">YI CHANG</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue</s1>
<s2>Pittsburgh, PA 15213</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Pennsylvanie</region>
<settlement type="city">Pittsburgh</settlement>
</placeName>
<orgName type="university">Université Carnegie-Mellon</orgName>
</affiliation>
</author>
<author><name sortKey="Datong Chen" sort="Datong Chen" uniqKey="Datong Chen" last="Datong Chen">DATONG CHEN</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue</s1>
<s2>Pittsburgh, PA 15213</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Pennsylvanie</region>
<settlement type="city">Pittsburgh</settlement>
</placeName>
<orgName type="university">Université Carnegie-Mellon</orgName>
</affiliation>
</author>
<author><name sortKey="Ying Zhang" sort="Ying Zhang" uniqKey="Ying Zhang" last="Ying Zhang">YING ZHANG</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue</s1>
<s2>Pittsburgh, PA 15213</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Pennsylvanie</region>
<settlement type="city">Pittsburgh</settlement>
</placeName>
<orgName type="university">Université Carnegie-Mellon</orgName>
</affiliation>
</author>
<author><name sortKey="Jie Yang" sort="Jie Yang" uniqKey="Jie Yang" last="Jie Yang">JIE YANG</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue</s1>
<s2>Pittsburgh, PA 15213</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Pennsylvanie</region>
<settlement type="city">Pittsburgh</settlement>
</placeName>
<orgName type="university">Université Carnegie-Mellon</orgName>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">Pattern recognition</title>
<title level="j" type="abbreviated">Pattern recogn.</title>
<idno type="ISSN">0031-3203</idno>
<imprint><date when="2009">2009</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">Pattern recognition</title>
<title level="j" type="abbreviated">Pattern recogn.</title>
<idno type="ISSN">0031-3203</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Accuracy</term>
<term>Arabic</term>
<term>Automatic classification</term>
<term>Automatic system</term>
<term>Automatic translation</term>
<term>Character recognition</term>
<term>Data structure</term>
<term>English</term>
<term>Error correction</term>
<term>Image classification</term>
<term>Image processing</term>
<term>Image recognition</term>
<term>Language processing</term>
<term>Learning algorithm</term>
<term>Localization</term>
<term>Optical character recognition</term>
<term>Pattern recognition</term>
<term>Portable equipment</term>
<term>Segmentation</term>
<term>Signal classification</term>
<term>Speech processing</term>
<term>Speech recognition</term>
<term>Support vector machine</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Système automatique</term>
<term>Traduction automatique</term>
<term>Arabe</term>
<term>Anglais</term>
<term>Reconnaissance image</term>
<term>Reconnaissance caractère</term>
<term>Algorithme apprentissage</term>
<term>Machine vecteur support</term>
<term>Localisation</term>
<term>Reconnaissance optique caractère</term>
<term>Correction erreur</term>
<term>Segmentation</term>
<term>Structure donnée</term>
<term>Appareil portatif</term>
<term>Reconnaissance parole</term>
<term>Précision</term>
<term>Classification image</term>
<term>Traitement langage</term>
<term>Reconnaissance forme</term>
<term>Classification signal</term>
<term>Classification automatique</term>
<term>Traitement parole</term>
<term>Traitement image</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr"><term>Traduction automatique</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">In this paper, we present a system that automatically translates Arabic text embedded in images into English. The system consists of three components: text detection from images, character recognition, and machine translation. We formulate the text detection as a binary classification problem and apply gradient boosting tree (GBT), support vector machine (SVM), and location-based prior knowledge to improve the F1 score of text detection from 78.95% to 87.05%. The detected text images are processed by off-the-shelf optical character recognition (OCR) software. We employ an error correction model to post-process the noisy OCR output, and apply a bigram language model to reduce word segmentation errors. The translation module is tailored with compact data structure for hand-held devices. The experimental results show substantial improvements in both word recognition accuracy and translation quality. For instance, in the experiment of Arabic transparent font, the BLEU score increases from 18.70 to 33.47 with use of the error correction module.</div>
</front>
</TEI>
<affiliations><list><country><li>États-Unis</li>
</country>
<region><li>Pennsylvanie</li>
</region>
<settlement><li>Pittsburgh</li>
</settlement>
<orgName><li>Université Carnegie-Mellon</li>
</orgName>
</list>
<tree><country name="États-Unis"><region name="Pennsylvanie"><name sortKey="Yi Chang" sort="Yi Chang" uniqKey="Yi Chang" last="Yi Chang">YI CHANG</name>
</region>
<name sortKey="Datong Chen" sort="Datong Chen" uniqKey="Datong Chen" last="Datong Chen">DATONG CHEN</name>
<name sortKey="Jie Yang" sort="Jie Yang" uniqKey="Jie Yang" last="Jie Yang">JIE YANG</name>
<name sortKey="Ying Zhang" sort="Ying Zhang" uniqKey="Ying Zhang" last="Ying Zhang">YING ZHANG</name>
</country>
</tree>
</affiliations>
</record>
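
To read fields out of a record like the one above, a short Python sketch follows. As displayed here, the record uses the `inist:` and `wicri:` prefixes without declaring them, so a namespace-aware parser such as `xml.etree.ElementTree` would reject it with an "unbound prefix" error. The sketch therefore wraps the record in an element that binds both prefixes to placeholder URIs. Those URIs, the `wrapper` element, the function names, and the abbreviated sample record are assumptions made for illustration, not part of Dilib or Wicri.

```python
import xml.etree.ElementTree as ET

# Abbreviated excerpt of the record above (one author, two keywords).
RECORD = """<record><TEI><teiHeader><fileDesc><titleStmt>
<title xml:lang="en" level="a">An image-based automatic Arabic translation system</title>
<author><name sortKey="Yi Chang">YI CHANG</name>
<affiliation wicri:level="4"><country>États-Unis</country></affiliation>
</author>
</titleStmt></fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Accuracy</term>
<term>Arabic</term>
</keywords></textClass></profileDesc>
</teiHeader></TEI></record>"""

# Placeholder namespace URIs (assumptions), used only so the parser
# accepts the otherwise undeclared prefixes.
WRAPPER_OPEN = (
    '<wrapper xmlns:wicri="urn:example:wicri" '
    'xmlns:inist="urn:example:inist">'
)


def parse_record(xml_text):
    """Parse a record after binding its prefixes on a wrapper element."""
    return ET.fromstring(WRAPPER_OPEN + xml_text + "</wrapper>")


def summarize(xml_text):
    """Extract the title, author names, and English keywords."""
    root = parse_record(xml_text)
    title = root.findtext(".//titleStmt/title")
    authors = [n.text for n in root.findall(".//titleStmt/author/name")]
    keywords = [
        t.text for t in root.findall(".//keywords[@scheme='KwdEn']/term")
    ]
    return {"title": title, "authors": authors, "keywords": keywords}


print(summarize(RECORD))
```

The same `summarize` call should work on the full record if it is read from a file as text, since only the wrapper is added and the record itself is left unchanged.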

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000A93 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000A93 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:09-0262446
   |texte=   An image-based automatic Arabic translation system
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024

	OCR exploration server
	Warning: this site is under development! Warning: this site was generated automatically from raw corpora. The information has therefore not been validated.
